Workload-Based Wavelet Synopses
Abstract
This paper introduces workload-based wavelet synopses, which exploit query workload information to significantly boost accuracy in approximate query processing. We show that wavelet synopses can adapt effectively to workload information and that they have significant advantages over previous approaches. An important aspect of our approach is that synopsis construction is optimized toward error metrics defined by the workload information rather than toward uniform metrics. We present an adaptive greedy algorithm that is simple and efficient. Its run time is competitive with that of previous, non-workload-based algorithms, and it constructs workload-based wavelet synopses that are significantly more accurate than previous synopses. The algorithm also obtains improved accuracy in the non-workload case when the error metric is the mean relative error. We also present a self-tuning algorithm that adapts the workload-based synopses to changes in the workload. All algorithms are extended to workload-based multidimensional wavelet synopses, with improved performance over previous algorithms. Experimental results demonstrate the effectiveness of workload-based wavelet synopses for different types of data sets and query workloads, and show significant improvement in accuracy even with very small training sets.
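The idea of steering coefficient selection by a workload-defined error metric can be illustrated with a small sketch. This is not the paper's adaptive greedy algorithm, whose details are not given in the abstract. It is a naive backward-elimination heuristic over unnormalized Haar coefficients, assuming point queries, a workload given as per-position query weights, and a weighted mean relative error with a sanity bound. All function names and the `sanity` parameter are hypothetical.

```python
def haar_decompose(data):
    """Unnormalized Haar transform: [overall average, details coarse to fine]."""
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    averages, details = list(data), []
    while len(averages) > 1:
        pairs = [(averages[i], averages[i + 1]) for i in range(0, len(averages), 2)]
        details = [(a - b) / 2 for a, b in pairs] + details
        averages = [(a + b) / 2 for a, b in pairs]
    return averages + details


def haar_reconstruct(coeffs):
    """Inverse of haar_decompose."""
    values, pos = [coeffs[0]], 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(values)]
        pos += len(level)
        values = [v for avg, d in zip(values, level) for v in (avg + d, avg - d)]
    return values


def workload_error(data, approx, weights, sanity=1.0):
    """Weighted mean relative error; weights model point-query frequencies (assumption)."""
    total = sum(weights)
    return sum(w * abs(d - a) / max(abs(d), sanity)
               for d, a, w in zip(data, approx, weights)) / total


def expand_synopsis(synopsis, n):
    """Turn {index: coefficient} into a full coefficient list with zeros elsewhere."""
    return [synopsis.get(i, 0.0) for i in range(n)]


def greedy_synopsis(data, weights, budget):
    """Naive sketch: repeatedly drop the coefficient whose removal hurts the
    workload error least, until only `budget` coefficients remain."""
    coeffs = haar_decompose(data)
    kept = set(range(len(coeffs)))
    while len(kept) > budget:
        def error_without(i):
            trial = [c if (j in kept and j != i) else 0.0
                     for j, c in enumerate(coeffs)]
            return workload_error(data, haar_reconstruct(trial), weights)
        # sorted() makes tie-breaking deterministic (lowest index wins).
        kept.remove(min(sorted(kept), key=error_without))
    return {i: coeffs[i] for i in sorted(kept)}


if __name__ == "__main__":
    data = [2, 2, 0, 2, 3, 5, 4, 4]
    weights = [10, 10, 1, 1, 1, 1, 1, 1]  # hypothetical workload: first two points are hot
    synopsis = greedy_synopsis(data, weights, budget=3)
    approx = haar_reconstruct(expand_synopsis(synopsis, len(data)))
    print(synopsis, workload_error(data, approx, weights))
```

Each elimination step here re-evaluates the full reconstruction, so the sketch is only suitable for tiny inputs; it is meant to show how a workload-weighted metric changes which coefficients are worth keeping, not to match the efficiency claimed in the abstract.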
Similar resources
Optimal Workload-Based Weighted Wavelet Synopses
In recent years, wavelets have been shown to be effective data synopses. We are concerned with the problem of efficiently finding wavelet synopses for massive data sets in situations where information about the query workload is available. We present linear-time, I/O-optimal algorithms for building optimal workload-based wavelet synopses for point queries. The synopses are based on a novel construction ...
A Framework for the Physical Design Problem for Data Synopses
Maintaining statistics on multidimensional data distributions is crucial for predicting the run-time and result size of queries and data analysis tasks with acceptable accuracy. Applications of such predictions include traditional query optimization, priority management and resource scheduling for data mining tasks, as well as querying heterogeneous Web data sources with diverse information qua...
τ-Synopses: A system for run-time management of remote synopses
τ-Synopses is a system designed to provide a run-time environment for remote execution of various synopses. It enables easy registration of new synopses from remote platforms, after which the system can manage these synopses, including triggering their construction, rebuild and update, and invoking them for approximate query processing. The system captures and analyzes query workloads, enablin...
Probabilistic Wavelet Synopses for Multiple Measures
The recently proposed idea of probabilistic wavelet synopses has enabled their use as a tool for reducing large amounts of data down to compact wavelet synopses that can be used to obtain fast, accurate approximate answers to user queries, while at the same time providing guarantees on the accuracy of individual answers. Relatively little attention, however, has been paid to the problem of usin...
Approximate Query Processing: Taming the TeraBytes
Garofalakis & Gibbons, VLDB 2001. Outline: Intro & Approximate Query Answering Overview (Synopses, System architecture, Commercial offerings); One-Dimensional Synopses (Histograms, Samples, Wavelets); Multi-Dimensional Synopses and Joins (Multi-D Histograms, Join synopses, Wavelets); Set-Valued Queries (Using Histograms, Samples, Wavelets); Advanced Techniques & Future Directions (Stre...
Publication date: 2003